Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

enhancement: report dependency not installed to kuadrant & policy status #994

Open
wants to merge 8 commits into
base: main
Choose a base branch
from

Conversation

KevFan
Copy link
Contributor

@KevFan KevFan commented Nov 8, 2024

Description

Closes: #730

Report in the Kuadrant and Policy status when required dependencies are not installed

  • Kuadrant CR
    • Gateway API not installed
    • Gateway Provider (istio / envoy gateway) not intalled
    • Limitador Operator not installed
    • Authorino Operator not installed
  • AuthPolicy CR
    • Gateway API not installed
    • Gateway Provider (istio / envoy gateway) not intalled
    • Authorino Operator not installed
  • RateLimitPolicy CR
    • Gateway API not installed
    • Gateway Provider (istio / envoy gateway) not intalled
    • Limitador Operator not installed
  • TLSPolicy CR (does not require a Gateway Provider)
    • Gateway API not installed
    • CertManager not installed
  • DNSPolicy CR (does not require a Gateway Provider)
    • Gateway API not installed
    • DNS Operator not installed

Verification

  • Gateway API and Gateway API provider status are tested by integration tests, so passing integration tests should be enough

To verify other status:

  • Create cluster
make local-setup
  • Delete CRDs to simulate not installed
# DNS Operator
kubectl delete crd dnsrecords.kuadrant.io   
# Authorino Operator                                                                    
kubectl delete crd authconfigs.authorino.kuadrant.io
kubectl delete crd authorinos.operator.authorino.kuadrant.io
# Limitador Operator
kubectl delete crd limitadors.limitador.kuadrant.io
# Cert Manager
kubectl delete crd certificates.cert-manager.io
  • Apply CRs
kubectl apply -f config/samples/kuadrant_v1_authpolicy.yaml                                      
kubectl apply -f config/samples/kuadrant_v1_dnspolicy.yaml
kubectl apply -f config/samples/kuadrant_v1_ratelimitpolicy.yaml
kubectl apply -f config/samples/kuadrant_v1_tlspolicy.yaml
kubectl apply -f config/samples/kuadrant_v1beta1_kuadrant.yaml
  • Check status
kubectl get kuadrant kuadrant-sample -o yaml | yq '.status'
kubectl get authpolicy authpolicy-sample -o yaml | yq '.status'
kubectl get ratelimitpolicy ratelimitpolicy-sample -o yaml | yq '.status'
kubectl get dnspolicy dnspolicy-sample -o yaml | yq '.status'
kubectl get tlspolicy tlspolicy-sample -o yaml | yq '.status'
image

@KevFan KevFan force-pushed the issues/730 branch 7 times, most recently from d01d7fc to 90b4385 Compare November 15, 2024 09:18
@KevFan KevFan self-assigned this Nov 15, 2024
@KevFan KevFan added kind/enhancement New feature or request size/medium labels Nov 15, 2024
@KevFan KevFan changed the title Issues/730 enhancement: report dependency not installed to kuadrant & policy status Nov 15, 2024
@KevFan KevFan marked this pull request as ready for review November 15, 2024 15:37
@Boomatang
Copy link
Contributor

I think there is some improvement needed here. Firstly I rebased this to current main, and everything worked as expected.

The simpler issue is around the message in the status block Limitador Operator is not installed, please restart pod once dependency has been installed. When I first seen this message I ask what pod do I restart. I think we need to be clear that it is the kuadrant operator that the user should restart. This might not be as important for the kuadrant CR directly, but on the policy CR's I think it is need.

The second issue is are clarity, but raises its own issue. Doing the example above the kuadrant CR states that Limitador Operator is missing, but all the resources are missing. The user would fix one issue, restart the operator only to find the next issue. Where they would have to repeat the process over and over. This is not great. I think we should be telling the user all the dependencies that are missing at the same time.

The third point and I don't think it is an issue but more of an opportunity. When I tried the example give, at first I didn't make every CR type, I only made the one for auth. As a user the only on thing I am currently using kuadrant for is auth. Why do I care if limitador and DNS operators are not working. Wouldn't it be nice if kuadrant only told you had a dependency issue if you apply resources to the cluster that depends on that dependency. I also think this could help us on the part of solving this issue: #71

While I am at it I think we might need more log level than info & error for production. I think there needs to be a warning level.

{"level":"info","ts":"2025-01-08T14:21:52Z","logger":"kuadrant-operator","msg":"envoygateway is not installed, skipping related watches and reconcilers","err":null}

The above message is from the logs in pr, and I think it should be a warning over info level log. This possible is out side the scope of this PR, but it is a good example of what would be a warning.

One last point, which is splitting hairs, technically the Limitador Operator is not installed is wrong. It is the limitador CRD that is not installed. Do we want to get that fine grained, I don't think so.

@KevFan
Copy link
Contributor Author

KevFan commented Jan 9, 2025

The simpler issue is around the message in the status block Limitador Operator is not installed, please restart pod once dependency has been installed. When I first seen this message I ask what pod do I restart. I think we need to be clear that it is the kuadrant operator that the user should restart. This might not be as important for the kuadrant CR directly, but on the policy CR's I think it is need.

I think this is a good shout. Are we thinking of just calling it "Kuadrant Operator pod"? I assume this is alright for the downstream also 🤔

The second issue is are clarity, but raises its own issue. Doing the example above the kuadrant CR states that Limitador Operator is missing, but all the resources are missing. The user would fix one issue, restart the operator only to find the next issue. Where they would have to repeat the process over and over. This is not great. I think we should be telling the user all the dependencies that are missing at the same time.

Yeah, this makes sense 👍

The third point and I don't think it is an issue but more of an opportunity. When I tried the example give, at first I didn't make every CR type, I only made the one for auth. As a user the only on thing I am currently using kuadrant for is auth. Why do I care if limitador and DNS operators are not working. Wouldn't it be nice if kuadrant only told you had a dependency issue if you apply resources to the cluster that depends on that dependency. I also think this could help us on the part of solving this issue: #71

This one, I'm not sure I understand the scenario you are saying. If you are doing Auth only, this missing dependency will only be reported in Kuadrant and AuthPolicy CR status. If you didn't apply Kuadrant CR, then it will be reported in AuthPolicy CR status only. Kuadrant CR cares about limitador and authorino since it creates an instance of those CRs. The current changes should already provide what you are saying - CRs that kuadrant provide should only report missing dependencies that each CR depend on - unless I'm missing something and you are saying this is not the case 🤔

While I am at it I think we might need more log level than info & error for production. I think there needs to be a warning level.

{"level":"info","ts":"2025-01-08T14:21:52Z","logger":"kuadrant-operator","msg":"envoygateway is not installed, skipping related watches and reconcilers","err":null}

The above message is from the logs in pr, and I think it should be a warning over info level log. This possible is out side the scope of this PR, but it is a good example of what would be a warning.

This would be outside of scope for sure. I don't believe the logger that we use even has a warning log type currently.

One last point, which is splitting hairs, technically the Limitador Operator is not installed is wrong. It is the limitador CRD that is not installed. Do we want to get that fine grained, I don't think so.

This is true, I don't think we want to be that fine grained. Might be able to improve this seperately, but outside of scope for this PR I would say. Kuadrant depends more on the CRD but would also have issues if the Operator is not running. Since we generally report status based on the status of the dependant CRs, I think we indicate something is unexepcted from the environment as our CR status will never go into a Ready/Enforced state 🤔 I can rename if that is our general preference everywhere. Will be a seperate issue/PR though.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/enhancement New feature or request size/medium
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[state-of-the-world] common policy status controller
2 participants